The Annals of Applied Statistics 

2009, Vol. 3, No. 4, 1295-1298 

DOI: 10.1214/09-AOAS312F 

Main article DOI: 10.1214/09-AOAS312 

© Institute of Mathematical Statistics, 2009 

DISCUSSION OF: BROWNIAN DISTANCE COVARIANCE 



By Bruno Remillard 
HEC Montréal 

In Szekely, Rizzo and Bakirov (2007), the notions of distance covariance 
and distance correlation between two random vectors were introduced. It 
was shown that the distance covariance is zero if and only if the two vectors 
are independent. An empirical version was also defined and its limiting 
distribution was investigated, under the null hypothesis of independence; 
furthermore, the underlying test based on the empirical version of the dis- 
tance covariance is consistent in the sense that under the hypothesis of 
dependence, its power tends to one as the sample size tends to infinity. 

In the present paper the authors continue the study of the properties of 
the distance covariance and they show that it can be defined in terms of 
covariances of multivariate Brownian processes. They also generalize that 
idea to other stochastic processes, namely, multivariate fractional Brownian 
motions. Defining dependence measures through other stochastic processes 
is quite interesting, but except for the few cases stated in the paper, it is 
still to be proven useful. I encourage the authors to continue to explore that 
interesting idea. Here are some questions I would like to be answered: (i) Can 
other dependence measures be written in that form, for example, Kendall's 
tau? (ii) What are the conditions on the underlying processes so that the 
value of the covariance is zero if and only if the two random vectors are 
independent? (iii) Can you prove a central limit theorem for the empirical 
version and what are the conditions on the underlying stochastic processes 
for the existence of the limiting distribution? 

In what follows I will suggest some other extensions and applications of the 
notion of distance covariance and distance correlation. More precisely, I will 
describe extensions using rank-based methods and suggest two applications 
in a multivariate context, that is, when more than two random vectors are 
involved. 

1. Rank-based methods. In my opinion, there are two weaknesses of the 
distance covariance: The moment assumption on the random vectors and the 
fact that the dependence measure depends on the marginal distributions. 
That problem can be dealt with easily when the margins are continuous 
by using the associated uniform variables defined through the well-known 
mapping 

$$X^{(j)} \mapsto U^{(j)} = F_{X^{(j)}}\bigl(X^{(j)}\bigr), \qquad j = 1,\ldots,p,$$
$$Y^{(k)} \mapsto V^{(k)} = F_{Y^{(k)}}\bigl(Y^{(k)}\bigr), \qquad k = 1,\ldots,q.$$

Then, the distance covariance between $U = (U^{(1)},\ldots,U^{(p)})$ and 
$V = (V^{(1)},\ldots,V^{(q)})$ only depends on the underlying copula of 
$(U,V)$, and $X$ and $Y$ are independent if and only if $U$ and $V$ are 
independent. Its empirical counterpart is simply computed by replacing the 
observations by their normalized ranks, that is, replacing $(X_i, Y_i)$ by 
$(R_{X,i}/n, R_{Y,i}/n)$, where $R_{X,ij}$ is the rank of $X_{ij}$ among 
$X_{1j},\ldots,X_{nj}$, $j = 1,\ldots,p$. It is relatively easy to prove that 
$n\mathcal{V}_n^2(U,V)$ converges in law to $\|\zeta\|^2$, where the covariance 
of $\zeta$ is $R_{U,V}$, as defined in Theorem 5. 
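
The rank-based version described above can be sketched in a few lines of Python. This is a minimal illustration, assuming NumPy and data without ties (continuous margins). The function names `normalized_ranks`, `dcov2` and `rank_dcov2` are hypothetical, and `dcov2` uses the double-centered distance-matrix form of the empirical squared distance covariance; it is not code taken from any of the cited papers.

```python
import numpy as np

def _as_2d(z):
    """Return z as an (n, dim) float array; a 1-D input becomes one column."""
    z = np.asarray(z, dtype=float)
    return z[:, None] if z.ndim == 1 else z

def normalized_ranks(x):
    """Column-wise ranks divided by n (assumes no ties within a column)."""
    x = _as_2d(x)
    n = x.shape[0]
    ranks = x.argsort(axis=0).argsort(axis=0) + 1  # ranks 1..n per column
    return ranks / n

def _centered_distances(z):
    """Double-centered matrix of pairwise Euclidean distances."""
    d = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=2)
    return (d - d.mean(axis=0, keepdims=True)
              - d.mean(axis=1, keepdims=True) + d.mean())

def dcov2(x, y):
    """Empirical squared distance covariance V_n^2 of two samples."""
    a = _centered_distances(_as_2d(x))
    b = _centered_distances(_as_2d(y))
    return (a * b).mean()

def rank_dcov2(x, y):
    """Squared distance covariance computed on normalized ranks."""
    return dcov2(normalized_ranks(x), normalized_ranks(y))
```

Because only the ranks enter the computation, `rank_dcov2` is unchanged when any coordinate is transformed by a strictly increasing function, which is the margin-free property discussed above. The test statistic would be `n * rank_dcov2(x, y)`.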

On the subject of rank-based methods, I disagree with the authors when 
they say that these methods are effective only for testing linear or mono- 
tone types of dependence. Because independence can also be characterized 
by copulas, and the latter can be efficiently estimated with ranks, their state- 
ment is totally inadequate. See, for example, Genest and Remillard (2004) 
for tests of nonserial and serial dependence based on ranks. Furthermore, 
in their Example 2, the authors suggest that the test based on the distance 
covariance is more powerful than its rank-based analog. Looking at Figure 2, 
this is the case only when the sample size n is quite small (n < 15). I would 
be more convinced by a simulation with different dependence models and 
sample sizes of the order 50 or 100, at the very least. 

2. Measuring dependence between several random vectors. As a competitor 
to the distance covariance for tests of independence, it is worth mentioning 
the Cramér-von Mises statistic $nB_n$, where 

$$B_n = \int \bigl\{F^n_{X,Y}(x,y) - F^n_X(x) F^n_Y(y)\bigr\}^2 \, dF^n_{X,Y}(x,y)$$

is the empirical counterpart of 

$$B = \int_{\mathbb{R}^{p+q}} \bigl\{F_{X,Y}(x,y) - F_X(x) F_Y(y)\bigr\}^2 \, dF_{X,Y}(x,y).$$

The latter dependence measure also characterizes independence in the sense 
that $B = 0$ only when $X$ and $Y$ are independent. 
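
For concreteness, here is a minimal Python sketch of the statistic $nB_n$, assuming NumPy and the componentwise order for the multivariate empirical distribution functions. The function name `cvm_independence_stat` is hypothetical. Since the integral is taken with respect to the empirical joint distribution, it reduces to an average over the sample points.

```python
import numpy as np

def cvm_independence_stat(x, y):
    """Return n * B_n for paired samples x (n, p) and y (n, q)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    if y.ndim == 1:
        y = y[:, None]
    n = x.shape[0]

    # le_x[i, j] is True when X_j <= X_i in every coordinate.
    le_x = (x[None, :, :] <= x[:, None, :]).all(axis=2)
    le_y = (y[None, :, :] <= y[:, None, :]).all(axis=2)

    f_x = le_x.mean(axis=1)            # empirical F_X at X_i
    f_y = le_y.mean(axis=1)            # empirical F_Y at Y_i
    f_xy = (le_x & le_y).mean(axis=1)  # empirical F_{X,Y} at (X_i, Y_i)

    # Integrating against the empirical joint law averages over the sample.
    b_n = np.mean((f_xy - f_x * f_y) ** 2)
    return n * b_n
```

The sketch costs on the order of $n^2$ comparisons per coordinate, which is adequate for moderate sample sizes. P-values would still have to be obtained by resampling, since the null distribution depends on the unknown margins.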

The limiting distribution of $n^{1/2}\{F^n_{X,Y}(x,y) - F^n_X(x) F^n_Y(y)\}$ 
used to construct $B_n$ was studied in Beran, Bilodeau and Lafaye de 
Micheaux (2007). In fact, the authors proposed testing independence between 
$d$ random vectors $Z_1,\ldots,Z_d$, using statistics based on 
$\mathbb{F}_n = n^{1/2}\{H_n(z_1,\ldots,z_d) - F_{n,1}(z_1) \cdots F_{n,d}(z_d)\}$, 
where $H_n$ is the empirical joint distribution function of 
$(Z_1,\ldots,Z_d)$, and $F_{n,j}$ is the empirical joint distribution function 
of $Z_j$, $j \in \{1,\ldots,d\}$, calculated from a sample 
$(Z_{11},\ldots,Z_{1d}),\ldots,(Z_{n1},\ldots,Z_{nd})$. Extending the 
results of Ghoudi, Kulperger and Remillard (2001) from random variables 
to random vectors, Beran, Bilodeau and Lafaye de Micheaux (2007) considered 
tests of nonserial and serial dependence based on the Möbius decomposition 
of $\mathbb{F}_n$, yielding asymptotically independent empirical processes 
$\mathbb{F}_{n,A}$ (depending only on the indices in $A$), for any subset $A$ 
of $\{1,\ldots,d\}$ containing at least two elements. These $2^d - d - 1$ 
processes can be combined to define powerful tests of independence 
[Genest, Quessy and Remillard (2007)]. 

Because the limiting distribution under the null hypothesis depends on 
the unknown distribution functions $F_1,\ldots,F_d$, Beran, Bilodeau and 
Lafaye de Micheaux (2007) showed that bootstrap methods worked for 
estimating the P-value of the underlying test statistics. 

Further, note that Bilodeau and Lafaye de Micheaux (2005) defined tests 
of independence between random vectors based on characteristic functions, 
when the marginal distributions were assumed to be Gaussian. They considered 
both serial and nonserial cases. The Cramér-von Mises type statistics 
they used are quite similar to the statistic $n\mathcal{V}_n^2$, when 
restricted to two random vectors. Therefore, it would be worth considering 
distance covariance measures for measuring dependence between several random 
vectors. In order to get nice covariance structures, Möbius transformations 
of the empirical characteristic functions should be used. More precisely, for 
any $A \subset \{1,\ldots,d\}$, one could define distance covariance measures 
$\mathcal{V}_{n,A} = \|\zeta_{n,A}\|^2$, where 

$$\zeta_{n,A}(t_1,\ldots,t_d) = n^{-1/2} \sum_{j=1}^{n} \prod_{k \in A} \bigl\{ e^{i\langle t_k, Z_{jk}\rangle} - f_{n,k}(t_k) \bigr\}.$$
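
As a check on this proposal, the case $d = 2$ and $A = \{1,2\}$ can be worked out directly. The following is a sketch under stated assumptions: the process is taken to be the normalized sum over $j$ of the products over $k \in A$ of the centered terms $e^{i\langle t_k, Z_{jk}\rangle} - f_{n,k}(t_k)$, where $f_{n,k}$ is assumed to be the empirical characteristic function of $Z_{1k},\ldots,Z_{nk}$, and $f_n$ (notation introduced here for illustration) is the joint empirical characteristic function of the pairs $(Z_{j1}, Z_{j2})$. Expanding the product and using $f_{n,k}(t_k) = n^{-1}\sum_{j} e^{i\langle t_k, Z_{jk}\rangle}$ gives

$$n^{-1/2} \sum_{j=1}^{n} \bigl\{ e^{i\langle t_1, Z_{j1}\rangle} - f_{n,1}(t_1) \bigr\} \bigl\{ e^{i\langle t_2, Z_{j2}\rangle} - f_{n,2}(t_2) \bigr\} = n^{1/2} \bigl\{ f_n(t_1,t_2) - f_{n,1}(t_1) f_{n,2}(t_2) \bigr\}.$$

With the weighted $L_2$ norm used in the main article, the squared norm of the right-hand side is $n\mathcal{V}_n^2(Z_1, Z_2)$, so for two vectors the construction reduces to the usual distance covariance statistic.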

3. Measuring dependence for multivariate time series. The distance 
covariance measures should also be defined in a time series context to measure 
serial dependence. For example, if $(Z_j)_{j \ge 1}$ is a stationary 
multivariate time series, one can easily define the "distance autocovariance" by 

$$\mathcal{V}^2(l) = \mathcal{V}^2(Z_j, Z_{j+l}), \qquad l \ge 1.$$

It is easy to show that under the white noise hypothesis and the assumption 
that $|Z_1|_p$ has finite expectation, 

$$n\mathcal{V}_n^2(l) \rightsquigarrow \|\zeta_l\|^2,$$

where $\zeta_1,\ldots,\zeta_m$ are independent copies of $\zeta$, as defined in 
Theorem 5. Again, Möbius transformations should be used to test independence 
between $(Z_1,\ldots,Z_m)$. Therefore, there are still many interesting avenues 
to explore, especially for time series applications. For example, rank-based 
methods could also be used. See, for example, Genest and Remillard (2004). 
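
A minimal Python sketch of the empirical distance autocovariance at lag $l$ follows, assuming NumPy. The names `dcov2` and `distance_autocovariance` are hypothetical, and the estimator simply applies the double-centered distance-matrix form of the empirical squared distance covariance to the lagged pairs $(Z_j, Z_{j+l})$; it is one natural choice, not a definition taken from the cited papers.

```python
import numpy as np

def _as_2d(z):
    """Return z as an (n, dim) float array; a 1-D input becomes one column."""
    z = np.asarray(z, dtype=float)
    return z[:, None] if z.ndim == 1 else z

def _centered_distances(z):
    """Double-centered matrix of pairwise Euclidean distances."""
    d = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=2)
    return (d - d.mean(axis=0, keepdims=True)
              - d.mean(axis=1, keepdims=True) + d.mean())

def dcov2(x, y):
    """Empirical squared distance covariance of two paired samples."""
    a = _centered_distances(_as_2d(x))
    b = _centered_distances(_as_2d(y))
    return (a * b).mean()

def distance_autocovariance(z, lag):
    """Empirical squared distance covariance between Z_j and Z_{j+lag}."""
    z = _as_2d(z)
    if not 1 <= lag < z.shape[0]:
        raise ValueError("lag must satisfy 1 <= lag < number of observations")
    # Pair each observation with the one `lag` steps later.
    return dcov2(z[:-lag], z[lag:])
```

Multiplying the returned value by the number of lagged pairs gives a statistic of the form $n\mathcal{V}_n^2(l)$. A rank-based variant would apply the same function to the normalized ranks of each coordinate of the series.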

4. Using residuals and pseudo-observations. Finally, one could ask what 
happens when observations are replaced by residuals (or pseudo-observations 
like normalized ranks)? For example, one would like to test independence 
of the error terms in several linear models, using the residuals. Based on 
the results in Ghoudi, Kulperger and Remillard (2001), the limiting 
distribution of $n\mathcal{V}_n^2$ should remain the same, under weak assumptions. That 
should also be true for the multidimensional extensions of the distance co- 
variance. However, replacing the unobservable innovations by residuals in 
multivariate time series models leads to completely different limiting 
processes. For example, using residuals of a simple AR(1) model of the form 
$Z_t = \mu + \phi(Z_{t-1} - \mu) + \varepsilon_t$, one can show that 
$n\mathcal{V}_n^2(l)$ converges in law to $\|\zeta_l - \gamma_l\|^2$, where 

$$\gamma_l(t,s) = s f(s) f'(t) \Phi \phi^{l-1},$$

where $f$ is the characteristic function of $\varepsilon_t$, and $\phi_n$ is an 
estimator of $\phi$ so that $n^{1/2}(\phi_n - \phi)$ converges in law to $\Phi$. 

Fortunately, using an analog of the transform $\Psi$ defined in 
Genest, Ghoudi and Remillard [(2007), page 1373], it might be possible to 
obtain limiting distributions not depending on the estimated parameters. 